AITopics | kl property

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Neural Information Processing SystemsMar-17-2026, 00:03:24 GMT

Cubic-regularized Newton's method (CR) is a popular algorithm that guarantees to produce a second-order stationary solution for solving nonconvex optimization problems. However, existing understandings of convergence rate of CR are conditioned on special types of geometrical properties of the objective function. In this paper, we explore the asymptotic convergence rate of CR by exploiting the ubiquitous Kurdyka-Lojasiewicz (KL) property of the nonconvex objective functions. In specific, we characterize the asymptotic convergence rate of various types of optimality measures for CR including function value gap, variable distance gap, gradient norm and least eigenvalue of the Hessian matrix. Our results fully characterize the diverse convergence behaviors of these optimality measures in the full parameter regime of the KL property. Moreover, we show that the obtained asymptotic convergence rates of CR are order-wise faster than those of first-order gradient descent algorithms under the KL property.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Add feedback

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Neural Information Processing SystemsNov-20-2025, 22:52:10 GMT

Cubic-regularized Newton's method (CR) is a popular algorithm that guarantees to produce a second-order stationary solution for solving nonconvex optimization problems. However, existing understandings of convergence rate of CR are conditioned on special types of geometrical properties of the objective function. In this paper, we explore the asymptotic convergence rate of CR by exploiting the ubiquitous Kurdyka-Lojasiewicz (KL) property of the nonconvex objective functions. In specific, we characterize the asymptotic convergence rate of various types of optimality measures for CR including function value gap, variable distance gap, gradient norm and least eigenvalue of the Hessian matrix. Our results fully characterize the diverse convergence behaviors of these optimality measures in the full parameter regime of the KL property. Moreover, we show that the obtained asymptotic convergence rates of CR are order-wise faster than those of first-order gradient descent algorithms under the KL property.

asymptotic convergence rate, cubic regularization, nonconvex optimization, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Add feedback

A Globally Optimal Portfolio for m-Sparse Sharpe Ratio Maximization

Neural Information Processing SystemsOct-9-2025, 20:29:35 GMT

The convergence rates of PGA are also provided.

constraint, optimal solution, portfolio, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Europe (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.96)

Add feedback

Accelerated Proximal Gradient Methods for Nonconvex Programming

Huan Li, Zhouchen Lin

Neural Information Processing SystemsOct-2-2025, 16:35:59 GMT

Nonconvex and nonsmooth problems have recently received considerable attention in signal/image processing, statistics and machine learning. However, solving the nonconvex and nonsmooth optimization problems remains a big challenge. Accelerated proximal gradient (APG) is an excellent method for convex programming. However, it is still unknown whether the usual APG can ensure the convergence to a critical point in nonconvex programming. In this paper, we extend APG for general nonconvex and nonsmooth programs by introducing a monitor that satisfies the sufficient descent property. Accordingly, we propose a monotone APG and a nonmonotone APG. The latter waives the requirement on monotonic reduction of the objective function and needs less computation in each iteration.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Enhancing Convergence of Decentralized Gradient Tracking under the KL Property

Chen, Xiaokai, Cao, Tianyu, Scutari, Gesualdo

arXiv.org Machine LearningDec-12-2024

We study decentralized multiagent optimization over networks, modeled as undirected graphs. The optimization problem consists of minimizing a nonconvex smooth function plus a convex extended-value function, which enforces constraints or extra structure on the solution (e.g., sparsity, low-rank). We further assume that the objective function satisfies the Kurdyka-{\L}ojasiewicz (KL) property, with given exponent $\theta\in [0,1)$. The KL property is satisfied by several (nonconvex) functions of practical interest, e.g., arising from machine learning applications; in the centralized setting, it permits to achieve strong convergence guarantees. Here we establish convergence of the same type for the notorious decentralized gradient-tracking-based algorithm SONATA. Specifically, $\textbf{(i)}$ when $\theta\in (0,1/2]$, the sequence generated by SONATA converges to a stationary solution of the problem at R-linear rate;$ \textbf{(ii)} $when $\theta\in (1/2,1)$, sublinear rate is certified; and finally $\textbf{(iii)}$ when $\theta=0$, the iterates will either converge in a finite number of steps or converges at R-linear rate. This matches the convergence behavior of centralized proximal-gradient algorithms except when $\theta=0$. Numerical results validate our theoretical findings.

algorithm, convergence, kl property, (15 more...)

arXiv.org Machine Learning

2412.09556

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Communications > Networks (0.93)

Add feedback

Reviews: Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Neural Information Processing SystemsOct-8-2024, 01:02:49 GMT

The paper investigates the convergence of different measurement of cubic regularization method for non-convex optimization under KL property. It consists with a list of work on CR methods based on the analysis of Nesterove el.s' work. Since the type of methods can guarantee the convergence to the second-order stationary point, it is quite popular also considering the raising of training neural networks. The paper is well-written, clear-organized and the theorems and proofs are easy to follow. Note that this is a pure theoretical work i.e., without new algorithms and/or numerical experiments.

artificial intelligence, machine learning, nonconvex optimization, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.62)

Add feedback

Efficient Low-rank Identification via Accelerated Iteratively Reweighted Nuclear Norm Minimization

Wang, Hao, Wang, Ye, Yang, Xiangyu

arXiv.org Artificial IntelligenceJun-26-2024

This paper considers the problem of minimizing the sum of a smooth function and the Schatten-$p$ norm of the matrix. Our contribution involves proposing accelerated iteratively reweighted nuclear norm methods designed for solving the nonconvex low-rank minimization problem. Two major novelties characterize our approach. Firstly, the proposed method possesses a rank identification property, enabling the provable identification of the "correct" rank of the stationary point within a finite number of iterations. Secondly, we introduce an adaptive updating strategy for smoothing parameters. This strategy automatically fixes parameters associated with zero singular values as constants upon detecting the "correct" rank while quickly driving the rest of the parameters to zero. This adaptive behavior transforms the algorithm into one that effectively solves smooth problems after a few iterations, setting our work apart from existing iteratively reweighted methods for low-rank optimization. We prove the global convergence of the proposed algorithm, guaranteeing that every limit point of the iterates is a critical point. Furthermore, a local convergence rate analysis is provided under the Kurdyka-{\L}ojasiewicz property. We conduct numerical experiments using both synthetic and real data to showcase our algorithm's efficiency and superiority over existing methods.

algorithm, singular value, theorem 3, (15 more...)

arXiv.org Artificial Intelligence

2406.15713

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Henan Province > Zhengzhou (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Add feedback

A majorized PAM method with subspace correction for low-rank composite factorization model

Tao, Ting, Qian, Yitian, Pan, Shaohua

arXiv.org Machine LearningJun-6-2024

This paper concerns a class of low-rank composite factorization models arising from matrix completion. For this nonconvex and nonsmooth optimization problem, we propose a proximal alternating minimization algorithm (PAMA) with subspace correction, in which a subspace correction step is imposed on every proximal subproblem so as to guarantee that the corrected proximal subproblem has a closed-form solution. For this subspace correction PAMA, we prove the subsequence convergence of the iterate sequence, and establish the convergence of the whole iterate sequence and the column subspace sequences of factor pairs under the KL property of objective function and a restrictive condition that holds automatically for the column $\ell_{2,0}$-norm function. Numerical comparison with the proximal alternating linearized minimization method on one-bit matrix completion problems indicates that PAMA has an advantage in seeking lower relative error within less time.

algorithm 1, col, sequence, (16 more...)

arXiv.org Machine Learning

2406.04588

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Accelerated Proximal Gradient Methods for Nonconvex Programming

Neural Information Processing SystemsMar-13-2024, 05:16:43 GMT

Nonconvex and nonsmooth problems have recently received considerable attention in signal/image processing, statistics and machine learning. However, solving the nonconvex and nonsmooth optimization problems remains a big challenge. Accelerated proximal gradient (APG) is an excellent method for convex programming. However, it is still unknown whether the usual APG can ensure the convergence to a critical point in nonconvex programming. In this paper, we extend APG for general nonconvex and nonsmooth programs by introducing a monitor that satisfies the sufficient descent property. Accordingly, we propose a monotone APG and a nonmonotone APG. The latter waives the requirement on monotonic reduction of the objective function and needs less computation in each iteration.

algorithm, apg, convergence, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

An inexact LPA for DC composite optimization and application to matrix completions with outliers

Tao, Ting, Liu, Ruyu, Pan, Shaohua

arXiv.org Machine LearningNov-21-2023

This paper concerns a class of DC composite optimization problems which, as an extension of convex composite optimization problems and DC programs with nonsmooth components, often arises in robust factorization models of low-rank matrix recovery. For this class of nonconvex and nonsmooth problems, we propose an inexact linearized proximal algorithm (iLPA) by computing in each step an inexact minimizer of a strongly convex majorization constructed with a partial linearization of their objective functions at the current iterate, and establish the convergence of the generated iterate sequence under the Kurdyka-\L\"ojasiewicz (KL) property of a potential function. In particular, by leveraging the composite structure, we provide a verifiable condition for the potential function to have the KL property of exponent $1/2$ at the limit point, so for the iterate sequence to have a local R-linear convergence rate. Finally, we apply the proposed iLPA to a robust factorization model for matrix completions with outliers and non-uniform sampling, and numerical comparison with the Polyak subgradient method confirms its superiority in terms of computing time and quality of solutions.

artificial intelligence, kl property, optimization problem, (18 more...)

arXiv.org Machine Learning

2303.16822

Country:

North America > United States > New York (0.04)
Europe > Spain > Aragón (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Filters

Collaborating Authors

kl property

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

A Globally Optimal Portfolio for m-Sparse Sharpe Ratio Maximization

Accelerated Proximal Gradient Methods for Nonconvex Programming

Enhancing Convergence of Decentralized Gradient Tracking under the KL Property

Reviews: Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Efficient Low-rank Identification via Accelerated Iteratively Reweighted Nuclear Norm Minimization

A majorized PAM method with subspace correction for low-rank composite factorization model

Accelerated Proximal Gradient Methods for Nonconvex Programming

An inexact LPA for DC composite optimization and application to matrix completions with outliers